Overview

Dataset statistics

Number of variables: 15
Number of observations: 456240
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 30.9 MiB
Average record size in memory: 71.0 B
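
As a quick sanity check, the average record size should equal the total size in memory divided by the number of observations. A minimal sketch in Python, using only the figures printed in this overview (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_mib = 30.9
n_observations = 456240

# 1 MiB = 1024 * 1024 bytes.
total_size_bytes = total_size_mib * 1024 * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

The result agrees with the reported average to the printed precision.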

Variable types

Numeric: 10
Boolean: 2
Categorical: 3

Warnings

preco_checkout is highly correlated with preco_base (High correlation)
preco_base is highly correlated with preco_checkout (High correlation)
df_index is highly correlated with cidade_codigo and 3 other fields (High correlation)
pagina_destaque is highly correlated with email_para_promocao (High correlation)
cidade_codigo is highly correlated with df_index and 4 other fields (High correlation)
preco_checkout is highly correlated with preco_base and 3 other fields (High correlation)
preco_base is highly correlated with preco_checkout and 3 other fields (High correlation)
qtd_pedido is highly correlated with categoria (High correlation)
cozinha is highly correlated with preco_checkout and 3 other fields (High correlation)
refeicao_id is highly correlated with preco_checkout and 3 other fields (High correlation)
regiao_codigo is highly correlated with df_index and 2 other fields (High correlation)
loja_tipo is highly correlated with cidade_codigo and 1 other field (High correlation)
loja_id is highly correlated with df_index and 3 other fields (High correlation)
email_para_promocao is highly correlated with pagina_destaque (High correlation)
op_area is highly correlated with df_index and 3 other fields (High correlation)
categoria is highly correlated with preco_checkout and 4 other fields (High correlation)
cozinha is highly correlated with categoria (High correlation)
categoria is highly correlated with cozinha (High correlation)
df_index is uniformly distributed (Uniform)
df_index has unique values (Unique)

Reproduction

Analysis started: 2021-08-29 18:36:51.815858
Analysis finished: 2021-08-29 18:37:36.757318
Duration: 44.94 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 456240
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 228264.5684
Minimum: 0
Maximum: 456547
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 22825.95
Q1: 114138.75
median: 228260.5
Q3: 342376.25
95-th percentile: 433719.05
Maximum: 456547
Range: 456547
Interquartile range (IQR): 228237.5
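
The last two rows are derived from the others: the range is the maximum minus the minimum, and the interquartile range is Q3 minus Q1. A small sketch that reproduces them from the df_index figures above (variable names are illustrative):

```python
# Quantile figures for df_index, copied from the table above.
minimum, maximum = 0, 456547
q1, q3 = 114138.75, 342376.25

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50% of the data

print(value_range, iqr)
```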

Descriptive statistics

Standard deviation: 131785.5114
Coefficient of variation (CV): 0.5773366944
Kurtosis: -1.199740008
Mean: 228264.5684
Median Absolute Deviation (MAD): 114119
Skewness: 0.0001090528568
Sum: 1.041434267 × 10^11
Variance: 1.7367421 × 10^10
Monotonicity: Strictly increasing
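
Two of these figures follow directly from the mean and standard deviation: the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A minimal check against the df_index values above, assuming the printed (rounded) figures as inputs:

```python
# Figures for df_index, copied from the descriptive statistics above.
mean = 228264.5684
std = 131785.5114

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance is the squared standard deviation

print(f"CV = {cv:.10f}")
print(f"Variance = {variance:.7e}")
```

Because the inputs are rounded, the results match the reported values only up to rounding error.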
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
0                     1         < 0.1%
248620                1         < 0.1%
201527                1         < 0.1%
203574                1         < 0.1%
197429                1         < 0.1%
199476                1         < 0.1%
209715                1         < 0.1%
211762                1         < 0.1%
205617                1         < 0.1%
207664                1         < 0.1%
Other values (456230) 456230    > 99.9%

Value                 Count     Frequency (%)
0                     1         < 0.1%
1                     1         < 0.1%
2                     1         < 0.1%
3                     1         < 0.1%
4                     1         < 0.1%
5                     1         < 0.1%
6                     1         < 0.1%
7                     1         < 0.1%
8                     1         < 0.1%
9                     1         < 0.1%

Value                 Count     Frequency (%)
456547                1         < 0.1%
456546                1         < 0.1%
456545                1         < 0.1%
456544                1         < 0.1%
456543                1         < 0.1%
456542                1         < 0.1%
456541                1         < 0.1%
456540                1         < 0.1%
456539                1         < 0.1%
456538                1         < 0.1%

semana
Real number (ℝ≥0)

Distinct: 145
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.78105602
Minimum: 1
Maximum: 145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 39
median: 76
Q3: 111
95-th percentile: 139
Maximum: 145
Range: 144
Interquartile range (IQR): 72

Descriptive statistics

Standard deviation: 41.51946501
Coefficient of variation (CV): 0.555213676
Kurtosis: -1.179373225
Mean: 74.78105602
Median Absolute Deviation (MAD): 36
Skewness: -0.04959792359
Sum: 34118109
Variance: 1723.865975
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
122                   3355      0.7%
105                   3348      0.7%
106                   3347      0.7%
140                   3332      0.7%
123                   3328      0.7%
134                   3322      0.7%
133                   3319      0.7%
113                   3312      0.7%
143                   3305      0.7%
94                    3303      0.7%
Other values (135)    422969    92.7%

Value                 Count     Frequency (%)
1                     2922      0.6%
2                     2896      0.6%
3                     2899      0.6%
4                     2889      0.6%
5                     2810      0.6%
6                     2823      0.6%
7                     2769      0.6%
8                     2785      0.6%
9                     2854      0.6%
10                    2859      0.6%

Value                 Count     Frequency (%)
145                   3268      0.7%
144                   3302      0.7%
143                   3305      0.7%
142                   3238      0.7%
141                   3263      0.7%
140                   3332      0.7%
139                   3279      0.7%
138                   3278      0.7%
137                   3283      0.7%
136                   3273      0.7%

loja_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 77
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.10862485
Minimum: 10
Maximum: 186
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 10
5-th percentile: 14
Q1: 43
median: 76
Q3: 110
95-th percentile: 161
Maximum: 186
Range: 176
Interquartile range (IQR): 67

Descriptive statistics

Standard deviation: 45.97466228
Coefficient of variation (CV): 0.5599248844
Kurtosis: -0.805091574
Mean: 82.10862485
Median Absolute Deviation (MAD): 33
Skewness: 0.3452015749
Sum: 37461239
Variance: 2113.669572
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
13                    7026      1.5%
10                    7006      1.5%
52                    6983      1.5%
43                    6922      1.5%
67                    6904      1.5%
174                   6883      1.5%
51                    6873      1.5%
137                   6873      1.5%
27                    6849      1.5%
108                   6838      1.5%
Other values (67)     387083    84.8%

Value                 Count     Frequency (%)
10                    7006      1.5%
11                    6790      1.5%
13                    7026      1.5%
14                    6040      1.3%
17                    6333      1.4%
20                    6671      1.5%
23                    6433      1.4%
24                    5232      1.1%
26                    5084      1.1%
27                    6849      1.5%

Value                 Count     Frequency (%)
186                   5528      1.2%
177                   5296      1.2%
174                   6883      1.5%
162                   4366      1.0%
161                   5591      1.2%
157                   5709      1.3%
153                   6693      1.5%
152                   5917      1.3%
149                   5021      1.1%
146                   6161      1.4%

refeicao_id
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2024.306922
Minimum: 1062
Maximum: 2956
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 1062
5-th percentile: 1198
Q1: 1558
median: 1993
Q3: 2539
95-th percentile: 2760
Maximum: 2956
Range: 1894
Interquartile range (IQR): 981

Descriptive statistics

Standard deviation: 547.5381884
Coefficient of variation (CV): 0.2704818042
Kurtosis: -1.243793678
Mean: 2024.306922
Median Absolute Deviation (MAD): 499
Skewness: -0.1726072093
Sum: 923569790
Variance: 299798.0677
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
1062                  11137     2.4%
2707                  11123     2.4%
1778                  11121     2.4%
1109                  11119     2.4%
1727                  11119     2.4%
1993                  11115     2.4%
1962                  11114     2.4%
1754                  11087     2.4%
1885                  11080     2.4%
2581                  11072     2.4%
Other values (41)     345153    75.7%

Value                 Count     Frequency (%)
1062                  11137     2.4%
1109                  11119     2.4%
1198                  4206      0.9%
1207                  10806     2.4%
1216                  9695      2.1%
1230                  10745     2.4%
1247                  7184      1.6%
1248                  9939      2.2%
1311                  4682      1.0%
1438                  4385      1.0%

Value                 Count     Frequency (%)
2956                  3319      0.7%
2867                  8092      1.8%
2826                  11056     2.4%
2760                  10209     2.2%
2707                  11123     2.4%
2704                  9811      2.2%
2664                  9853      2.2%
2640                  10747     2.4%
2631                  10458     2.3%
2581                  11072     2.4%

preco_checkout
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1990
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 332.3371971
Minimum: 45.62
Maximum: 767.33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 45.62
5-th percentile: 121.28
Q1: 228.98
median: 296.82
Q3: 445.23
95-th percentile: 640.23
Maximum: 767.33
Range: 721.71
Interquartile range (IQR): 216.25

Descriptive statistics

Standard deviation: 152.9378086
Coefficient of variation (CV): 0.4601886575
Kurtosis: -0.2537255609
Mean: 332.3371971
Median Absolute Deviation (MAD): 104.73
Skewness: 0.6715232236
Sum: 151625522.8
Variance: 23389.97329
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
290.03                7340      1.6%
291.03                7275      1.6%
292.03                7199      1.6%
486.03                6635      1.5%
484.03                6631      1.5%
485.03                6584      1.4%
280.33                5833      1.3%
281.33                5828      1.3%
282.33                5729      1.3%
447.23                5240      1.1%
Other values (1980)   391946    85.9%

Value                 Count     Frequency (%)
45.62                 1         < 0.1%
47.59                 1         < 0.1%
53.41                 1         < 0.1%
55.35                 3         < 0.1%
56.26                 1         < 0.1%
58.26                 1         < 0.1%
64.02                 3         < 0.1%
65.02                 5         < 0.1%
65.96                 1         < 0.1%
66.02                 6         < 0.1%

Value                 Count     Frequency (%)
767.33                217       < 0.1%
766.33                222       < 0.1%
765.33                184       < 0.1%
760.54                2         < 0.1%
759.57                1         < 0.1%
759.54                1         < 0.1%
758.54                1         < 0.1%
757.63                2         < 0.1%
756.63                2         < 0.1%
755.66                1         < 0.1%

preco_base
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1907
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 354.192825
Minimum: 55.35
Maximum: 866.27
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 55.35
5-th percentile: 144.53
Q1: 243.5
median: 310.46
Q3: 459.81
95-th percentile: 668.33
Maximum: 866.27
Range: 810.92
Interquartile range (IQR): 216.31

Descriptive statistics

Standard deviation: 160.7554415
Coefficient of variation (CV): 0.4538641954
Kurtosis: -0.5079199783
Mean: 354.192825
Median Absolute Deviation (MAD): 111.49
Skewness: 0.6370089178
Sum: 161596934.5
Variance: 25842.31198
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
292.03                9509      2.1%
290.03                9381      2.1%
291.03                9377      2.1%
280.33                6837      1.5%
282.33                6561      1.4%
281.33                6476      1.4%
445.23                6304      1.4%
446.23                6262      1.4%
447.23                6229      1.4%
486.03                5607      1.2%
Other values (1897)   383697    84.1%

Value                 Count     Frequency (%)
55.35                 1         < 0.1%
64.02                 1         < 0.1%
65.02                 1         < 0.1%
66.02                 1         < 0.1%
72.75                 1         < 0.1%
73.75                 1         < 0.1%
74.75                 1         < 0.1%
75.66                 1         < 0.1%
79.54                 1         < 0.1%
81.54                 2         < 0.1%

Value                 Count     Frequency (%)
866.27                2         < 0.1%
865.27                4         < 0.1%
864.27                1         < 0.1%
767.33                284       0.1%
766.33                282       0.1%
765.33                292       0.1%
760.54                1         < 0.1%
759.57                1         < 0.1%
759.54                1         < 0.1%
758.54                2         < 0.1%

email_para_promocao
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 445.7 KiB

Value                 Count     Frequency (%)
False                 419456    91.9%
True                  36784     8.1%

pagina_destaque
Boolean

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 445.7 KiB

Value                 Count     Frequency (%)
False                 406617    89.1%
True                  49623     10.9%

qtd_pedido
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 994
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.870489116
Minimum: 2.564949357
Maximum: 8.466110401
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 2.564949357
5-th percentile: 2.63905733
Q1: 3.988984047
median: 4.912654886
Q3: 5.780743516
95-th percentile: 6.776506992
Maximum: 8.466110401
Range: 5.901161044
Interquartile range (IQR): 1.791759469

Descriptive statistics

Standard deviation: 1.215326969
Coefficient of variation (CV): 0.2495287311
Kurtosis: -0.6372417251
Mean: 4.870489116
Median Absolute Deviation (MAD): 0.9103910097
Skewness: -0.06512203977
Sum: 2222111.954
Variance: 1.477019641
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
2.564949357           12397     2.7%
2.708050201           12294     2.7%
2.63905733            12269     2.7%
3.33220451            11548     2.5%
3.295836866           11462     2.5%
3.258096538           11457     2.5%
3.713572067           10344     2.3%
3.688879454           10179     2.2%
3.737669618           10083     2.2%
3.988984047           8845      1.9%
Other values (984)    345362    75.7%

Value                 Count     Frequency (%)
2.564949357           12397     2.7%
2.63905733            12269     2.7%
2.708050201           12294     2.7%
3.258096538           11457     2.5%
3.295836866           11462     2.5%
3.33220451            11548     2.5%
3.688879454           10179     2.2%
3.713572067           10344     2.3%
3.737669618           10083     2.2%
3.970291914           8715      1.9%

Value                 Count     Frequency (%)
8.466110401           1         < 0.1%
8.463581422           2         < 0.1%
8.460834458           3         < 0.1%
8.46062284            1         < 0.1%
8.457867725           2         < 0.1%
8.457655479           2         < 0.1%
8.455104999           3         < 0.1%
8.454679286           1         < 0.1%
8.452121195           1         < 0.1%
8.451907725           1         < 0.1%
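
The three smallest qtd_pedido values coincide, to the printed precision, with the natural logarithms of 13, 14 and 15. This suggests that the column holds log-transformed order counts, though that is an inference from the numbers and not something the report states. A minimal check in Python, where the candidate raw counts are a hypothesis:

```python
import math

# Smallest reported qtd_pedido values, copied from the tables above.
reported = [2.564949357, 2.63905733, 2.708050201]

# Hypothetical raw order counts (an assumption, not stated in the report).
candidates = [13, 14, 15]

# The reported values are rounded, so compare with a small tolerance.
matches = [
    abs(math.log(count) - value) < 5e-9
    for count, value in zip(candidates, reported)
]

for count, value in zip(candidates, reported):
    print(count, math.log(count), value)
```

If the hypothesis holds, applying `math.exp` to a value would recover the original count.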

categoria
Categorical

HIGH CORRELATION

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 446.3 KiB

Value                 Count
Beverages             127876
Rice Bowl             33215
Sandwich              33205
Pizza                 33138
Starters              29941
Other values (9)      198865

Length

Max length: 12
Median length: 8
Mean length: 7.530481764
Min length: 4

Characters and Unicode

Total characters: 3435707
Distinct characters: 31
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Beverages
2nd row: Beverages
3rd row: Beverages
4th row: Beverages
5th row: Beverages

Common Values

Value                 Count     Frequency (%)
Beverages             127876    28.0%
Rice Bowl             33215     7.3%
Sandwich              33205     7.3%
Pizza                 33138     7.3%
Starters              29941     6.6%
Other Snacks          29379     6.4%
Desert                29294     6.4%
Salad                 28545     6.3%
Pasta                 27694     6.1%
Seafood               26915     5.9%
Other values (4)      57038     12.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
beverages127876
24.6%
bowl33215
 
6.4%
rice33215
 
6.4%
sandwich33205
 
6.4%
pizza33138
 
6.4%
starters29941
 
5.8%
other29379
 
5.7%
snacks29379
 
5.7%
desert29294
 
5.6%
salad28545
 
5.5%
Other values (6)111647
21.5%

Most occurring characters

ValueCountFrequency (%)
e561666
16.3%
a427108
12.4%
r280607
 
8.2%
s267933
 
7.8%
B181705
 
5.3%
S160660
 
4.7%
t159811
 
4.7%
i150973
 
4.4%
v127876
 
3.7%
g127876
 
3.7%
Other values (21)989492
28.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2854279
83.1%
Uppercase Letter518834
 
15.1%
Space Separator62594
 
1.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e561666
19.7%
a427108
15.0%
r280607
9.8%
s267933
9.4%
t159811
 
5.6%
i150973
 
5.3%
v127876
 
4.5%
g127876
 
4.5%
o99720
 
3.5%
c95799
 
3.4%
Other values (12)554910
19.4%
Uppercase Letter
ValueCountFrequency (%)
B181705
35.0%
S160660
31.0%
P60832
 
11.7%
R33215
 
6.4%
O29379
 
5.7%
D29294
 
5.6%
E13562
 
2.6%
F10187
 
2.0%
Space Separator
ValueCountFrequency (%)
62594
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3373113
98.2%
Common62594
 
1.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e561666
16.7%
a427108
12.7%
r280607
 
8.3%
s267933
 
7.9%
B181705
 
5.4%
S160660
 
4.8%
t159811
 
4.7%
i150973
 
4.5%
v127876
 
3.8%
g127876
 
3.8%
Other values (20)926898
27.5%
Common
ValueCountFrequency (%)
62594
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3435707
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e561666
16.3%
a427108
12.4%
r280607
 
8.2%
s267933
 
7.8%
B181705
 
5.3%
S160660
 
4.7%
t159811
 
4.7%
i150973
 
4.4%
v127876
 
3.7%
g127876
 
3.7%
Other values (21)989492
28.8%

cozinha
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 445.9 KiB

Value                 Count
Italian               122825
Thai                  118203
Indian                112419
Continental           102793

Length

Max length: 11
Median length: 6
Mean length: 6.877573207
Min length: 4

Characters and Unicode

Total characters: 3137824
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Thai
2nd row: Thai
3rd row: Thai
4th row: Thai
5th row: Thai

Common Values

Value                 Count     Frequency (%)
Italian               122825    26.9%
Thai                  118203    25.9%
Indian                112419    24.6%
Continental           102793    22.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
italian122825
26.9%
thai118203
25.9%
indian112419
24.6%
continental102793
22.5%

Most occurring characters

ValueCountFrequency (%)
n656042
20.9%
a579065
18.5%
i456240
14.5%
t328411
10.5%
I235244
 
7.5%
l225618
 
7.2%
T118203
 
3.8%
h118203
 
3.8%
d112419
 
3.6%
C102793
 
3.3%
Other values (2)205586
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2681584
85.5%
Uppercase Letter456240
 
14.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n656042
24.5%
a579065
21.6%
i456240
17.0%
t328411
12.2%
l225618
 
8.4%
h118203
 
4.4%
d112419
 
4.2%
o102793
 
3.8%
e102793
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
I235244
51.6%
T118203
25.9%
C102793
22.5%

Most occurring scripts

ValueCountFrequency (%)
Latin3137824
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n656042
20.9%
a579065
18.5%
i456240
14.5%
t328411
10.5%
I235244
 
7.5%
l225618
 
7.2%
T118203
 
3.8%
h118203
 
3.8%
d112419
 
3.6%
C102793
 
3.3%
Other values (2)205586
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII3137824
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n656042
20.9%
a579065
18.5%
i456240
14.5%
t328411
10.5%
I235244
 
7.5%
l225618
 
7.2%
T118203
 
3.8%
h118203
 
3.8%
d112419
 
3.6%
C102793
 
3.3%
Other values (2)205586
 
6.6%

cidade_codigo
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 51
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 601.5494455
Minimum: 456
Maximum: 713
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 456
5-th percentile: 478
Q1: 553
median: 596
Q3: 651
95-th percentile: 700
Maximum: 713
Range: 257
Interquartile range (IQR): 98

Descriptive statistics

Standard deviation: 66.20361943
Coefficient of variation (CV): 0.1100551583
Kurtosis: -0.7912046047
Mean: 601.5494455
Median Absolute Deviation (MAD): 53
Skewness: -0.2090812318
Sum: 274450919
Variance: 4382.919226
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
590                   54635     12.0%
526                   43508     9.5%
638                   20028     4.4%
522                   13456     2.9%
517                   13105     2.9%
604                   13052     2.9%
699                   12089     2.6%
647                   11818     2.6%
576                   11451     2.5%
614                   11330     2.5%
Other values (41)     251768    55.2%

Value                 Count     Frequency (%)
456                   6713      1.5%
461                   5762      1.3%
473                   5853      1.3%
478                   5021      1.1%
485                   5707      1.3%
515                   5084      1.1%
517                   13105     2.9%
522                   13456     2.9%
526                   43508     9.5%
541                   4501      1.0%

Value                 Count     Frequency (%)
713                   6849      1.5%
703                   6703      1.5%
702                   5264      1.2%
700                   6883      1.5%
699                   12089     2.6%
698                   6433      1.4%
695                   5294      1.2%
693                   4627      1.0%
685                   6983      1.5%
683                   5296      1.2%

regiao_codigo
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.61246274
Minimum: 23
Maximum: 93
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 23
5-th percentile: 34
Q1: 34
median: 56
Q3: 77
95-th percentile: 85
Maximum: 93
Range: 70
Interquartile range (IQR): 43

Descriptive statistics

Standard deviation: 17.64357046
Coefficient of variation (CV): 0.3116552365
Kurtosis: -1.05161428
Mean: 56.61246274
Median Absolute Deviation (MAD): 21
Skewness: 0.05627398237
Sum: 25828870
Variance: 311.2955787
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value                 Count     Frequency (%)
56                    191022    41.9%
34                    116686    25.6%
77                    94580     20.7%
85                    30268     6.6%
23                    6433      1.4%
71                    6278      1.4%
93                    5709      1.3%
35                    5264      1.2%

Value                 Count     Frequency (%)
23                    6433      1.4%
34                    116686    25.6%
35                    5264      1.2%
56                    191022    41.9%
71                    6278      1.4%
77                    94580     20.7%
85                    30268     6.6%
93                    5709      1.3%

Value                 Count     Frequency (%)
93                    5709      1.3%
85                    30268     6.6%
77                    94580     20.7%
71                    6278      1.4%
56                    191022    41.9%
35                    5264      1.2%
34                    116686    25.6%
23                    6433      1.4%

loja_tipo
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 445.8 KiB

Value                 Count
TYPE_A                262689
TYPE_C                99544
TYPE_B                94007

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 2737440
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TYPE_C
2nd row: TYPE_C
3rd row: TYPE_C
4th row: TYPE_C
5th row: TYPE_C

Common Values

Value                 Count     Frequency (%)
TYPE_A                262689    57.6%
TYPE_C                99544     21.8%
TYPE_B                94007     20.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
type_a262689
57.6%
type_c99544
 
21.8%
type_b94007
 
20.6%

Most occurring characters

ValueCountFrequency (%)
T456240
16.7%
Y456240
16.7%
P456240
16.7%
E456240
16.7%
_456240
16.7%
A262689
9.6%
C99544
 
3.6%
B94007
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter2281200
83.3%
Connector Punctuation456240
 
16.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T456240
20.0%
Y456240
20.0%
P456240
20.0%
E456240
20.0%
A262689
11.5%
C99544
 
4.4%
B94007
 
4.1%
Connector Punctuation
ValueCountFrequency (%)
_456240
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2281200
83.3%
Common456240
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
T456240
20.0%
Y456240
20.0%
P456240
20.0%
E456240
20.0%
A262689
11.5%
C99544
 
4.4%
B94007
 
4.1%
Common
ValueCountFrequency (%)
_456240
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2737440
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T456240
16.7%
Y456240
16.7%
P456240
16.7%
E456240
16.7%
_456240
16.7%
A262689
9.6%
C99544
 
3.6%
B94007
 
3.4%

op_area
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 30
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.083081711
Minimum: 0.9
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.5 MiB

Quantile statistics

Minimum: 0.9
5-th percentile: 2.7
Q1: 3.6
median: 4
Q3: 4.5
95-th percentile: 6.7
Maximum: 7
Range: 6.1
Interquartile range (IQR): 0.9

Descriptive statistics

Standard deviation: 1.091530289
Coefficient of variation (CV): 0.2673300134
Kurtosis: 1.452919071
Mean: 4.083081711
Median Absolute Deviation (MAD): 0.5
Skewness: 0.6645434202
Sum: 1862865.2
Variance: 1.191438371
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value                 Count     Frequency (%)
4                     52514     11.5%
3.9                   48523     10.6%
3.8                   40066     8.8%
4.4                   26030     5.7%
4.5                   25652     5.6%
2.8                   25500     5.6%
4.1                   23330     5.1%
7                     20660     4.5%
4.8                   18642     4.1%
3.4                   17257     3.8%
Other values (20)     158066    34.6%

Value                 Count     Frequency (%)
0.9                   3432      0.8%
1.9                   4083      0.9%
2                     9512      2.1%
2.4                   5021      1.1%
2.7                   12427     2.7%
2.8                   25500     5.6%
2.9                   4711      1.0%
3                     11182     2.5%
3.2                   6333      1.4%
3.4                   17257     3.8%

Value                 Count     Frequency (%)
7                     20660     4.5%
6.7                   7026      1.5%
6.3                   7006      1.5%
5.6                   6983      1.5%
5.3                   6051      1.3%
5.1                   13310     2.9%
5                     6161      1.4%
4.8                   18642     4.1%
4.7                   5975      1.3%
4.6                   5975      1.3%

Interactions

[Figures: 100 pairwise interaction plots]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
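
The calculation above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report generator runs; the function name pearson_r is hypothetical, and the two short lists are the first five preco_checkout and preco_base values from the sample rows of this report.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and sample standard deviations (the n - 1 factors cancel in r).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)


# First five preco_checkout / preco_base values from the "First rows" sample.
checkout = [136.83, 135.83, 132.92, 135.86, 146.50]
base = [152.29, 152.29, 133.92, 134.86, 147.50]
r_sample = pearson_r(checkout, base)

# Shifting and rescaling either variable leaves r unchanged (up to rounding).
r_rescaled = pearson_r([3 * v + 5 for v in checkout], base)
```

Five rows are far too few to reproduce the correlation reported for the full dataset; the values only show how the function is called.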

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
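
As a rough sketch of that recipe: rank each variable, then apply Pearson's formula to the ranks. The helper names average_ranks and spearman_rho are hypothetical, and assigning tied values the average of their ranks is an assumption (it is the common convention, but the report does not state which tie rule it uses).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is 1, Pearson's r is below 1.
xs = [1, 2, 3, 4, 5]
ys = [v ** 3 for v in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
```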

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
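
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name kendall_tau_a is hypothetical, tied pairs are counted as neither concordant nor discordant, and a profiling tool may well use a tie-adjusted variant instead, so results can differ on data with many ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(x)), 2):
        # The sign of the product tells whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y: counted in neither group.
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations give three pairs: two concordant, one discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This direct version compares every pair, so it takes quadratic time; it is meant to mirror the definition, not to scale to hundreds of thousands of rows.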

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013, which can be found here.
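
A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013): the chi-squared statistic is divided by the sample size, the expected bias is subtracted, and the table dimensions are shrunk accordingly. The function name cramers_v_corrected and the list-of-lists contingency table input are assumptions for illustration; this is not necessarily how the report generator implements it.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows of counts."""
    n_rows = len(table)
    n_cols = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    n = sum(row_totals)
    if n < 2:
        raise ValueError("need at least two observations")

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected bias and never go below zero.
    phi2_corr = max(0.0, phi2 - (n_rows - 1) * (n_cols - 1) / (n - 1))
    rows_corr = n_rows - (n_rows - 1) ** 2 / (n - 1)
    cols_corr = n_cols - (n_cols - 1) ** 2 / (n - 1)
    denom = min(rows_corr - 1, cols_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Hypothetical 2x2 tables: one with independent variables, one perfectly associated.
v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

In practice the contingency table would be built from two categorical columns such as categoria and cozinha.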

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
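
The data behind both displays can be sketched without any plotting library. The helper names below are hypothetical, the input is assumed to be a dict mapping column names to equally long lists, and only None is treated as missing (a DataFrame-based tool would typically also treat NaN as missing).

```python
def nullity_by_column(columns):
    """Number of missing (None) cells per column: the data behind the bar display."""
    return {name: sum(value is None for value in values) for name, values in columns.items()}


def nullity_matrix(columns):
    """One row of booleans per observation, True where the cell is present."""
    names = list(columns)
    n_rows = len(columns[names[0]]) if names else 0
    return [[columns[name][i] is not None for name in names] for i in range(n_rows)]


# Values taken from the first three sample rows of this report: nothing is missing.
sample = {
    "semana": [1, 2, 3],
    "preco_checkout": [136.83, 135.83, 132.92],
}
counts = nullity_by_column(sample)

# A hypothetical gap, added only to show how a missing cell appears.
with_gap = {"semana": [1, None, 3], "preco_checkout": [136.83, 135.83, None]}
matrix = nullity_matrix(with_gap)
```

For this dataset every count is zero, which matches the "Missing cells 0" figure in the overview.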

Sample

First rows

   df_index  semana  loja_id  refeicao_id  preco_checkout  preco_base  email_para_promocao  pagina_destaque  qtd_pedido  categoria  cozinha  cidade_codigo  regiao_codigo  loja_tipo  op_area
0         0       1       55         1885          136.83      152.29                False            False    5.176150  Beverages     Thai            647             56     TYPE_C      2.0
1         1       2       55         1885          135.83      152.29                False            False    5.777652  Beverages     Thai            647             56     TYPE_C      2.0
2         2       3       55         1885          132.92      133.92                False            False    4.564348  Beverages     Thai            647             56     TYPE_C      2.0
3         3       4       55         1885          135.86      134.86                False            False    5.093750  Beverages     Thai            647             56     TYPE_C      2.0
4         4       5       55         1885          146.50      147.50                False            False    5.370638  Beverages     Thai            647             56     TYPE_C      2.0
5         5       6       55         1885          146.53      146.53                False            False    5.652489  Beverages     Thai            647             56     TYPE_C      2.0
6         6       7       55         1885          145.53      146.53                False            False    4.997212  Beverages     Thai            647             56     TYPE_C      2.0
7         7       8       55         1885          146.53      145.53                False            False    4.905275  Beverages     Thai            647             56     TYPE_C      2.0
8         8       9       55         1885          134.83      134.83                False            False    5.164786  Beverages     Thai            647             56     TYPE_C      2.0
9         9      10       55         1885          144.56      143.56                False            False    5.164786  Beverages     Thai            647             56     TYPE_C      2.0

Last rows

        df_index  semana  loja_id  refeicao_id  preco_checkout  preco_base  email_para_promocao  pagina_destaque  qtd_pedido  categoria      cozinha  cidade_codigo  regiao_codigo  loja_tipo  op_area
456230    456538     136       61         2104          571.33      573.33                False            False    2.708050       Fish  Continental            473             77     TYPE_A      4.5
456231    456539     137       61         2104          631.53      631.53                False            False    3.713572       Fish  Continental            473             77     TYPE_A      4.5
456232    456540     138       61         2104          631.53      630.53                False             True    5.010635       Fish  Continental            473             77     TYPE_A      4.5
456233    456541     139       61         2104          490.82      629.53                False             True    5.743003       Fish  Continental            473             77     TYPE_A      4.5
456234    456542     140       61         2104          485.03      629.53                False             True    5.003946       Fish  Continental            473             77     TYPE_A      4.5
456235    456543     141       61         2104          583.03      630.53                False             True    2.564949       Fish  Continental            473             77     TYPE_A      4.5
456236    456544     142       61         2104          581.03      582.03                False            False    3.737670       Fish  Continental            473             77     TYPE_A      4.5
456237    456545     143       61         2104          583.03      581.03                False            False    3.688879       Fish  Continental            473             77     TYPE_A      4.5
456238    456546     144       61         2104          582.03      581.03                False            False    3.970292       Fish  Continental            473             77     TYPE_A      4.5
456239    456547     145       61         2104          581.03      582.03                False            False    3.295837       Fish  Continental            473             77     TYPE_A      4.5